Efficient Computation of Mean Truncated Hitting Times on Very Large Graphs
Abstract
Previous work has shown the effectiveness of random walk hitting times as a measure of dissimilarity in a variety of graph-based learning problems such as collaborative filtering, query suggestion or finding paraphrases. However, application of hitting times has been limited to small datasets because of computational restrictions. This paper develops a new approximation algorithm with which hitting times can be computed on very large, disk-resident graphs, making their application possible to problems which were previously out of reach. This will potentially benefit a range of large-scale problems.
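For orientation, the quantity named in the title can be computed exactly on a small in-memory graph. The T-truncated hitting time from node i to node j is the expected number of steps a random walk started at i needs to first reach j, where any walk that has not arrived within T steps counts as T. The sketch below uses the standard dynamic-programming recurrence for this definition; it is not the paper's approximation algorithm for disk-resident graphs, and the function name, the dense adjacency-matrix input, and the assumption that every node has at least one edge are illustrative choices only.

```python
import numpy as np

def truncated_hitting_times(adj, target, horizon):
    """Exact truncated hitting times from every node to `target`.

    Hypothetical helper for illustration; assumes `adj` is a dense,
    non-negative adjacency matrix in which every row has a nonzero sum.
    """
    adj = np.asarray(adj, dtype=float)
    # Row-normalise the adjacency matrix into random-walk transition probabilities.
    P = adj / adj.sum(axis=1, keepdims=True)
    # With a horizon of 0 steps, every truncated hitting time is 0.
    h = np.zeros(adj.shape[0])
    for _ in range(horizon):
        # One more step of lookahead: take a step, then continue from the neighbour.
        h = 1.0 + P @ h
        # The walk stops as soon as it is at the target.
        h[target] = 0.0
    return h

# Path graph 0 - 1 - 2, hitting node 2 within at most 2 steps.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
print(truncated_hitting_times(path, target=2, horizon=2))  # values 2.0, 1.5, 0.0
```

Each iteration costs one matrix-vector product, so this exact approach needs the whole graph in memory for every target node, which is the kind of cost that motivates approximation on very large graphs.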
Similar Resources
A Tractable Approach to Finding Closest Truncated-commute-time Neighbors in Large Graphs
Recently there has been much interest in graph-based learning, with applications in collaborative filtering for recommender networks, link prediction for social networks and fraud detection. These networks can consist of millions of entities, and so it is very important to develop highly efficient techniques. We are especially interested in accelerating random walk approaches to compute some ve...
Hitting and commute times in large graphs are often misleading
Next to the shortest path distance, the second most popular distance function between vertices in a graph is the commute distance (resistance distance). For two vertices u and v, the hitting time Huv is the expected time it takes a random walk to travel from u to v. The commute time is its symmetrized version Cuv = Huv + Hvu. In our paper we study the behavior of hitting times and commute dista...
Supplementary materials and proofs
2 Hitting times
2.1 Typical hitting times are large
2.2 Exponential mixing on spatial graphs
2.3 Expected hitting times degenerate to the stationary distribution
2.4 The case of one dimension ...
Application of adaptive sampling in fishery part 2: Truncated adaptive cluster sampling designs
Researchers quite often encounter networks in the initial samples that are so large that they cannot finish the sampling procedure. Two solutions have been proposed and used by marine biologists, which we discuss in this article: i) adaptive cluster sampling based on order statistics with a stopping rule, ii) restricted adaptive cluster sampling. Until...
Hitting and commute times in large random neighborhood graphs
In machine learning, a popular tool to analyze the structure of graphs is the hitting time and the commute distance (resistance distance). For two vertices u and v, the hitting time Huv is the expected time it takes a random walk to travel from u to v. The commute distance is its symmetrized version Cuv = Huv + Hvu. In our paper we study the behavior of hitting times and commute distances when t...
Journal title:
- CoRR
Volume abs/1304.4371
Publication date 2013